Investigative Ophthalmology & Visual Science
● Association for Research in Vision and Ophthalmology (ARVO)
Preprints posted in the last 7 days, ranked by how well they match Investigative Ophthalmology & Visual Science's content profile, based on 22 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit.
Wells, A.; Boyer, D.; Goldberg, R.; Hohman, T.; Maturi, R.; Patel, S.
Purpose: To evaluate the safety and exploratory outcomes of a single intravitreal injection of OGX110, a peptide agonist of CXCR3, in eyes with persistent fluid secondary to neovascular age-related macular degeneration (nAMD) despite ongoing anti-vascular endothelial growth factor (anti-VEGF) therapy. Methods: This prospective, open-label, sequential dose-escalation phase I study (NCT05904691) enrolled subjects receiving standard-of-care intravitreal anti-VEGF therapy. Subjects received a single intravitreal injection of OGX110 at 0.5 mg, 1.0 mg, or 2.0 mg (n=3 per cohort), 7 to 14 days after the anti-VEGF injection. Results: All nine enrolled subjects completed follow-up through day 56. Two subjects (22%) experienced at least 1 adverse event (AE); all were mild and unrelated to study treatment. Exploratory analyses showed a BCVA change of +1.4 letters following anti-VEGF injection and +4.4 letters from OGX110 baseline to 4 weeks (P < 0.05). Six of 9 subjects gained at least 3 ETDRS letters after OGX110. Anatomic responses were heterogeneous. Four eyes showed a reduction in CRT after anti-VEGF injection that was maintained after OGX110 administration. One additional eye demonstrated a substantial reduction in CRT after OGX110 despite minimal response to anti-VEGF treatment. Conclusions: A single intravitreal injection of OGX110 was well tolerated. Exploratory functional and anatomic findings suggest biologic activity; interpretation is limited by small sample size, open-label design, absence of a concurrent control group, and inter-subject heterogeneity. These results support further study in a controlled trial. Translational Relevance: OGX110 represents a mechanistically distinct investigational approach for nAMD that may warrant further evaluation in eyes with persistent.
Pohlmann-Krappitz, D.; Kaeferstein, I.; Kruse, B.; Winterhalter, S.; Thiel, A.; Pleyer, U.; Braun, J.
Purpose: To characterize peripheral immune alterations in treated birdshot uveitis (BU) patients using high-dimensional mass cytometry and multiplex serology. Design: Cohort study. Subjects: 36 BU patients on immunomodulatory treatment (IMT) and 31 healthy controls (HCs). Methods: Detailed ophthalmologic examinations were performed, and peripheral blood and serum samples were collected for immune profiling using mass cytometry and multiplex cytokine analysis. Main Outcome Measures: Imaging-based indicators of ocular inflammation; peripheral immune cell frequencies; serum cytokine levels. Results: Compared to HCs, BU patients showed increased frequencies of Th17, CD146+ T cells, intermediate effector/central memory T cells co-expressing CXCR3 and CCR4, CD56dim NK cells and elevated IL-18 levels. Patients were clinically stratified by an expert ophthalmologist into three disease activity groups: Inactive, Active (comprising combinations of surface retina, deep retina and choroid activity) and Burned-out. Inactive patients harbored more quiescent effector T cells, e.g. Tim-3+ Tc17-Tc22 intermediates and more CD8+ TSCM, potentially representing a resting pool of autoimmune T cells. Active patients exhibited increased in vivo activation of relevant T cells, with stronger HLA-DR, CD38 or PD-1 expression, and highest levels of CD56dim NK cells. Immune profiles were also linked to treatment subgroups: csDMARDs (conventional synthetic disease-modifying antirheumatic drugs) were associated with higher CD56bright NK frequencies, and absence of therapy showed elevated PD-1/SLAMF7 Tc17+1 and PD-1CD57 CD8 TEMRA cells. IL-6R blockade (tocilizumab) resulted in loss of IL-6R T-cells accompanied by increased SLAMF7 T cells, due to epitope masking. Conclusions: Peripheral CyTOF profiling anchored to thorough clinical stratification revealed disease activity-associated immune signatures and therapy-associated imprints in BU.
Yildiz, E.; Zha, L.; Zebardast, N.; Shi, M.; Wang, M.
Purpose: To predict retinal nerve fiber layer thickness (RNFLT) norms from fundus images. Methods: We selected 18,000 OCT scans and visual fields (VF) from the Massachusetts Eye and Ear Glaucoma Service. A U-Net-based deep learning model was developed to predict RNFLT norms from OCT en face fundus images. A total of 10,000 OCT scans with normal VFs (mean deviation [MD] ≥ -1 dB, glaucoma hemifield test within normal limits, and pattern standard deviation probability > 5%) tested within 30 days were used for training, while the remaining 8,000 OCT scans (mean VF MD: 3.3 ± 4.9 dB), including 2,419 scans with normal VFs, were used for evaluation. Structure-function correlations between RNFLT maps and VFs were assessed using linear regression and VGG-16 across original RNFLT maps, deviation maps, and their combination. Performance was evaluated using correlation coefficients, mean absolute error (MAE), and R-squared. Results: Predicted RNFLT norm maps showed agreement with baseline RNFLT maps in eyes with normal VFs (R-squared = 0.81 ± 0.13). RNFLT deviation maps correlated more strongly with VF MD than original RNFLT maps (R = 0.42 vs. 0.19, p < 0.01). In deep learning-based VF prediction, combining original and deviation maps achieved the best performance (MAE = 3.31 dB, R-squared = 0.39), outperforming (p < 0.05) the model using original RNFLT maps alone (MAE = 3.36 dB, R-squared = 0.35). Conclusions: Deep learning can estimate individualized RNFLT norms and improve structure-function assessment in glaucoma. Translational Relevance: Personalized RNFLT norm prediction may improve detection of glaucomatous damage.
Shi, L.; Shi, M.; Chung, I. Y.; Pasquale, L. R.; Shen, L. Q.; Wang, M.
Purpose: To develop and evaluate a deep learning model that predicts optical coherence tomography (OCT)-equivalent retinal nerve fiber layer thickness (RNFLT) maps directly from color fundus photographs and to assess their diagnostic value for glaucoma detection. Design: Retrospective model development and evaluation study. Participants: 15,031 paired fundus photographs and spectral-domain OCT scans collected at Massachusetts Eye and Ear between 2011 and 2022. Methods: Paired fundus and OCT images were used to train a U-Net-based model to predict pixel-wise RNFLT maps with artifact-corrected supervision. Diagnostic performance was evaluated across single-modality models (fundus photos only, real RNFLT maps, predicted RNFLT maps) and multimodal fusion models (fundus + predicted RNFLT maps). Stratified analyses examined model performance across glaucoma severity and demographic subgroups. Glaucoma was defined based on standard criteria applied to Humphrey 24-2 visual field testing. Main Outcome Measures: Mean absolute error (MAE) and structural similarity index (SSIM) for RNFLT map prediction. Area under the ROC curve (AUC) and accuracy for glaucoma detection. Results: RNFLT map prediction achieved an MAE of 15.4 µm and an SSIM of 0.65, measured against artifact-corrected RNFLT maps derived from OCT. For glaucoma detection, the predicted RNFLT-only classifier outperformed the fundus-only classifier (AUC 0.889 vs 0.883, p < 0.005; Accuracy 82.0% vs 78.0%), but performed worse than the real-RNFLT-only classifier (AUC 0.889 vs 0.903, p < 0.005). Multimodal fusion of fundus images with predicted RNFLT maps improved performance, achieving an AUC of 0.909, outperforming all single-modality inputs (p < 0.005 vs fundus-only, predicted-RNFLT-only, and real-RNFLT-only). Performance gains between the fundus-only and the multimodal classifier were greater in early-stage glaucoma compared to severe cases: accuracy increased from 55.3% to 64.0% in mild cases, from 71.5% to 80.4% in moderate cases, and from 90.0% to 94.6% in severe cases. Conclusions: Predicted RNFLT maps derived from fundus photographs provide quantitative, OCT-like structural information and improve glaucoma detection. Unlike prior work that predicted only summary RNFLT values, our model generates full RNFLT maps that better support glaucoma classification than fundus images alone. This approach offers a scalable pathway for early glaucoma screening and expands diagnostic access in resource-limited settings.
Okuzumi, N.; Mori, S.; Katakami, K.; Iwaki, Y.; Sakamoto, M.; Yamada, Y.; Nakamura, M.
Purpose: To evaluate the impact of "not commonly considered risk factors" on glaucoma surgical outcomes. Methods: This study included 339 eyes that underwent glaucoma surgery. Surgical procedures included microhook ab-interno trabeculotomy (TLO), Preserflo ab-externo microshunt implantation, trabeculectomy (Trab), and Ahmed Glaucoma Valve (AGV) implantation. In addition to conventional background factors, we examined a set of "not commonly considered risk factors," including very elderly age (≥85 years), avitreous status, aphakia, use of antithrombotic agents, difficulty attending frequent postoperative visits, small palpebral fissure, corneal endothelial dysfunction, poor vision in the fellow eye, dementia, hearing loss, mental illness, atopic dermatitis, pseudophacodonesis, glaucoma eye drop allergy, and conditions contraindicating β-blocker use. Surgical success was defined as intraocular pressure (IOP) ≤21 mmHg, ≥20% reduction from baseline, and no additional glaucoma surgery at 1 year. Logistic regression was performed to identify potential risk factors; significant factors were further evaluated using propensity score matching. Results: Of the 339 cases, surgical success rates were 65% for TLO, 82% for Preserflo, 91% for Trab, and 82% for AGV. Multivariate logistic regression identified two independent predictors of surgical failure: small palpebral fissure (odds ratio 2.52, p < 0.01) and hearing loss (odds ratio 3.94, p = 0.04). Propensity score matching of patients with small versus large palpebral fissures (111 per group) confirmed significantly worse postoperative outcomes in the small-palpebral-fissure group despite balanced baseline characteristics. Conclusion: Small palpebral fissure is an independent and previously unnoticed risk factor for glaucoma surgical failure, affecting both minimally invasive and filtration procedures.
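The composite success definition in this abstract can be read as a conjunction of three conditions. The following is a minimal sketch of that reading, not the authors' code: the function name `is_surgical_success` and its parameters are hypothetical, and it assumes that all three criteria must hold at the 1-year visit.

```python
def is_surgical_success(baseline_iop: float, iop_1y: float,
                        additional_surgery: bool) -> bool:
    """Return True if an eye meets all three criteria at 1 year.

    Assumed reading of the abstract's definition:
      1. IOP at 1 year is at most 21 mmHg,
      2. IOP fell by at least 20% relative to baseline,
      3. no additional glaucoma surgery was performed.
    """
    if baseline_iop <= 0:
        raise ValueError("baseline IOP must be positive")
    relative_reduction = (baseline_iop - iop_1y) / baseline_iop
    return (iop_1y <= 21.0
            and relative_reduction >= 0.20
            and not additional_surgery)


# Hypothetical eyes (illustrative values only, not study data).
print(is_surgical_success(30.0, 18.0, False))  # meets all three criteria
print(is_surgical_success(24.0, 21.0, False))  # reduction is only 12.5%
print(is_surgical_success(40.0, 25.0, False))  # IOP still above 21 mmHg
```

An eye that reaches the pressure target but needs a second operation counts as a failure under this reading, which is why the three conditions are combined with `and`.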
Baek, J. S.; Lokhande, A.; Neuenschwander, D.; Shi, M.; Wang, M.
Purpose: To investigate the relative efficacy of nine distinct visual field (VF) denoising artificial intelligence (AI) methods and a pathology-aware AI strategy to discourage over-correction of glaucomatous defects. Design: Retrospective study. Participants: 87,940 paired visual field (VF) and optical coherence tomography (OCT) samples from a tertiary academic center. Methods: Denoising models were trained on a separate VF-only dataset and evaluated on an independent structure-function dataset of paired VF-OCT samples. We implemented and evaluated nine distinct VF denoising strategies representing three broad categories: baseline measurements, self-supervised and image restoration models (including Noise2Noise, Noise2Void, and NAFNet), and latent variable compression-based models (autoencoders and variational autoencoders). All models were designed to reconstruct VF sensitivity maps. We then predicted retinal nerve fiber layer thickness (RNFLT) maps from the denoised VFs using a fixed, independently trained VF-to-RNFLT prediction model. Main Outcome Measures: Predicted VF and RNFLT maps and resultant evaluation metrics. Results: The raw VF baseline achieved a global R² of 0.5468 and MAE of 16.83 µm. Restoration-based models maintained or slightly improved concordance, with the pathology-aware NAFNet achieving the highest global R² of 0.5485 and a comparable MAE of 16.82 µm. In contrast, compression-based models degraded concordance, with CNN-VAE showing a significant reduction (R² approximately 0.50). In severe glaucoma, concordance decreased across all methods; however, compression architectures exhibited disproportionately greater degradation compared with restoration-based approaches. Conclusions: We present a comparative benchmark of AI-based VF denoising strategies paired with structure-function evaluation. While restoration-based models can reduce variability without loss of biological signal, latent compression risks attenuating clinically meaningful defects. Visually smoother fields are not necessarily more biologically accurate.
Callet, C.; Bertrand, M.; Guzman, K.; Mece, P.; Rossi, E. A.; Grieve, K.
The retinal nerve fiber layer, composed of axon bundles converging toward the optic nerve, is a key biomarker for diagnosing and monitoring glaucoma and other neurodegenerative diseases. High-resolution en face imaging of individual nerve fiber bundles offers morphological information beyond what conventional optical coherence tomography provides, yet clinical integration remains limited by the lack of automated analysis tools and normative data. Here, we imaged 14 healthy volunteers using time-domain full-field optical coherence tomography and adaptive optics scanning laser ophthalmoscopy, and developed automated pipelines to quantify bundle width, trajectory, tortuosity, and orientation. Bundles were on average 25% wider at shallower retinal depths, width measurements were consistent across imaging modalities, and estimated axon count per bundle decreased significantly with age. Global trajectory analysis revealed systematic deviations of high resolution data from existing mathematical models, particularly in the temporal sector, leading us to propose two refined trajectory models. These normative results provide a foundation for high resolution biomarkers for use in investigations of retinal neurodegeneration.
Chagas Ferreira, M. C.; Pellegrini, M. A.; Sequeira, B. J.
Background: Refractive errors are the leading cause of preventable visual impairment worldwide, yet data from isolated Indigenous populations remain virtually absent from the global literature. The Yanomami, one of the largest Indigenous peoples in the Americas with recent and limited contact with non-Indigenous society, have no prior epidemiological data on refractive errors. Methods: A cross-sectional observational study was conducted in 2024 at the Yanomami Indigenous Health House, Boa Vista, Roraima, Brazil. A total of 158 self-identified Yanomami individuals aged 5 years or older were examined by an ophthalmologist. Refractive status was classified according to International Myopia Institute criteria. Results: Emmetropia was observed in 67.7% of participants, with a marked age-related decline from 100% in children aged 5 to 9 years to 38.6% in those aged 40 to 59 years. Myopia was present in 16.5% of participants, all low myopia; it was absent in children under 10 years and no high myopia was identified. Astigmatism affected 24.1% of participants and hyperopia 13.3%. Presbyopia was identified in 25.9%. Overall, 25.3% of participants presented with reduced visual acuity attributable to uncorrected refractive error, of whom 67.5% improved to normal or near-normal acuity (p < 0.001). Conclusions: This is the first characterisation of the Yanomami refractive profile, revealing a distinct myopia pattern shaped by high outdoor exposure and minimal near-work demands. Despite this, refractive correction remains effectively inaccessible to this population, leaving preventable visual impairment unaddressed and reflecting a profound health inequity. Corrective lens provision represents a high-impact, scalable intervention for this underserved community.
Wood, A. M.; Detwiler, R. E.; Coughlin, M.; Pollard, C. E.; Alt, J. A.; Pulsipher, A.; Kramer Stratton, J.
Background: Chronic rhinosinusitis (CRS) is a heterogeneous inflammatory airway disease associated with impaired mucociliary clearance and persistent inflammation. While prior work has focused on inflammatory and molecular pathways, the physicochemical properties of mucus itself remain poorly characterized. This study aimed to define compositional and biophysical features of CRS mucus that may contribute to dysfunction. Methods: A prospective cross-sectional study was conducted in 15 adults undergoing endoscopic sinus surgery (11 CRS, 4 controls). Mucus was collected from the middle meatus. Hydration was measured by lyophilization. Ionic composition was quantified using mass spectrometry. Viscoelasticity was assessed via oscillatory shear rheology. Total protein, total carbohydrate, sialic acid (Sia) and fucose (Fuc) content were quantified using enzymatic and chemical assays. Statistical comparisons were performed using nonparametric tests. Results: CRS mucus exhibited significantly higher Ca2+ and Mg2+ concentrations (approximately two-fold; p<0.05) and increased variability in hydration and ion content compared to controls. Rheology showed greater heterogeneity and a non-significant trend toward increased viscoelasticity in CRS. Total protein and carbohydrate content were not significantly different; however, the carbohydrate-to-protein ratio was significantly reduced in CRS (p=0.04). Sia content and Sia-to-carbohydrate ratio were significantly elevated in CRS (p=0.04 and p=0.002), particularly in CRS with nasal polyps. Fuc content did not differ between groups. Conclusions: CRS mucus demonstrates coordinated alterations in ionic composition and glycosylation, characterized by increased cation content, hypersialylation, and reduced carbohydrate-to-protein ratios. These changes may contribute to altered mucus properties and impaired mucociliary clearance, highlighting mucus composition as a potential therapeutic target in CRS.
Lee, S. S.-Y.; Wang, C. A.; de Vries, V. A.; van Hemert, D. J.; Schulze, A.; Brandl, C.; Aman, A. M.; Alonso-Caneiro, D.; Choquet, H.; Gorski, M.; Hammond, C. J.; Heid, I. M.; Hunter, M. L.; Hysi, P.; Jiang, C.; Jonas, J.; Klaver, C. C.; Kneepkens, S.; Konig, S.; Lingham, G.; Luber, C.; Melton, P. E.; Pennell, C. E.; Ramdas, W. D.; Read, S. A.; Schuster, A. K.; Wang, Y. X.; Zimmermann, M. E.; International Glaucoma Genetics Consortium; Khawaja, A. P.; Gharahkhani, P.; MacGregor, S.; Guggenheim, J. A.; Mackey, D. A.
The choroid is critical for maintaining vision and implicated in several ocular diseases, being the sole source of nutrients and waste removal for the outer retina. Genetic discovery can help elucidate the pathways through which choroidal features influence disease risk. Our meta-analysis of genome-wide association studies (n= 78,682 participants) identified 30 genomic regions, including 20 novel loci, associated with choroidal thickness. Findings suggest inflammatory and vascular processes drive choroidal thickness, with overlapping mechanisms shared with refractive error. Genome-wide independently significant SNPs accounted for 18.7% of the genetic variance in choroidal thickness. Mendelian randomisation analyses showed a causal effect of age-related macular degeneration on choroidal thickness, and suggest a bidirectional causal effect between choroidal thickness and primary angle-closure glaucoma. These findings provide insight into the shared genetic architecture and biological pathways linking choroidal thickness and related diseases.
Wang, E.; Kohli, A.; Taha, H. B.
Background: Frontotemporal dementia (FTD) lacks widely accessible disease-specific biomarkers. Optical coherence tomography (OCT) and OCT angiography (OCTA) may provide non-invasive measures of retinal changes associated with neurodegeneration. We conducted a systematic review and meta-analysis evaluating retinal biomarkers in FTD compared with Alzheimer disease (AD) and controls. Methods: A systematic search of PubMed and Embase was conducted through April 25, 2026 according to PRISMA guidelines. Studies evaluating OCT/OCTA biomarkers in FTD with comparator groups were included. Inverse weighted random-effects models, publication bias assessments, and meta-regressions were performed. Results: Ten studies involving 139 individuals with FTD, 87 with AD, 29 with mild cognitive impairment, 14 with TDP-43 proteinopathy, 5 with tauopathy, and 255 controls were included in the systematic review; five studies were eligible for meta-analysis. Compared with AD, individuals with FTD demonstrated significantly thinner retinal nerve fiber layer (RNFL) thickness (SMD = -0.61, 95% CI -0.98, -0.24). Compared with controls, individuals with FTD exhibited significantly thinner ganglion cell layer-inner plexiform layer (GCL-IPL) thickness (SMD = -0.55, 95% CI -1.02, -0.08), whereas pooled analyses across multiple retinal biomarkers were non-significant (SMD = -0.19, 95% CI -0.52, 0.14). RNFL thickness correlated negatively with the percentage of female participants in FTD and positively with age in both AD and controls. Conclusions: Individuals with FTD exhibit lower RNFL thickness than AD and lower GCL-IPL thickness than controls, suggesting retinal alterations may reflect neurodegeneration. However, larger longitudinal studies with standardized OCT/OCTA protocols are needed to determine the diagnostic and prognostic utility of retinal biomarkers in FTD.
Dias, Y.; Gebrekidan, F.; Lowder, J.; Sutcliffe, S.; Yaeger, L.
OBJECTIVE: We performed a systematic review and meta-analysis (SRMA) of post-surgical outcomes, comparing chlorhexidine gluconate (CHG) versus povidone iodine (PI) for vaginal antisepsis of major gynecologic procedures. DATA SOURCES: Ovid Medline, Embase, Scopus, Cochrane, and Clinicaltrials.gov were searched between 1986 and December 2023 for studies comparing CHG with PI for vaginal antisepsis of major gynecologic operations. STUDY ELIGIBILITY CRITERIA: We included randomized controlled trials (RCTs) and non-RCTs comparing CHG to PI for vaginal antisepsis of major gynecologic operations. The primary outcome was surgical site infections (SSIs) and the secondary outcomes were urinary tract infections (UTIs) and vaginal irritation. METHODS: Summary estimates were calculated by fixed effects models when I² ≤ 25% and by random effects models when I² > 25%. Statistical analysis was performed using RevMan 5.4.1. The protocol for this systematic review was registered on PROSPERO (ID CRD42022378101). RESULTS: Nine studies met the inclusion criteria, four of which were RCTs. In total, 9538 patients were included, 4300 (45%) of whom were allocated to CHG and 5238 (55%) to PI. No statistically significant difference in SSI incidence was found for vaginal antisepsis with CHG versus PI in pooled analyses (n=9538 patients; RR 1.20; 95% CI 0.92-1.57; I² = 0%). In contrast, a significantly higher risk of UTIs was observed for vaginal antisepsis with CHG than with PI (n=6061 patients; RR 1.48; 95% CI 1.03-2.14; I² = 0%). CONCLUSION: In our SRMA, there were no significant differences in SSI risk when either CHG or PI was utilized for antiseptic vaginal preparation. Interestingly, vaginal antisepsis with PI was associated with a lower incidence of post-operative UTIs following major gynecologic surgery. Our findings support current guidelines that either form of vaginal antisepsis can be used for SSI prevention. They also suggest that PI may result in fewer postoperative UTIs, but further randomized studies are needed to support these findings. Key words: surgical site infection, surgical wound infection, urinary tract infection, urogynecologic surgery, Chlorhexidine, Povidone Iodine, surgical antiseptic
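The model-selection rule stated in the methods (fixed effects when I² ≤ 25%, random effects otherwise) can be illustrated with a short sketch. This is a generic inverse-variance implementation with a DerSimonian-Laird random-effects step, chosen here for illustration; it is not a reproduction of the RevMan analysis, the function name `pooled_log_rr` is hypothetical, and the inputs below are made-up numbers rather than study data.

```python
import math


def pooled_log_rr(log_rrs, variances, i2_threshold=25.0):
    """Pool log risk ratios, choosing the model from I² (in percent).

    Returns (pooled_log_rr, i2_percent, model_name).
    """
    weights = [1.0 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, log_rrs)) / sum(weights)

    # Cochran's Q and the I² heterogeneity statistic.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_rrs))
    df = len(log_rrs) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    if i2 <= i2_threshold:
        return fixed, i2, "fixed"

    # DerSimonian-Laird estimate of the between-study variance tau².
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c)
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, log_rrs)) / sum(re_weights)
    return pooled, i2, "random"


# Hypothetical inputs: three studies with similar effects.
estimate, i2, model = pooled_log_rr(
    [math.log(1.2), math.log(1.1), math.log(1.3)],
    [0.04, 0.09, 0.06],
)
print(f"model={model}, I2={i2:.1f}%, pooled RR={math.exp(estimate):.2f}")
```

Pooling is done on the log scale because risk ratios are approximately normally distributed after a log transform; the result is exponentiated only for reporting.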
Yang, Y.; Peracchio, L.; Mayourian, J.; Miller, T.; La Cava, W.
Background: Artificial intelligence-enhanced electrocardiography (AI-ECG) enables scalable, low-cost cardiac dysfunction screening, but existing models are annotation-intensive and predominantly adult-derived, leaving paediatric generalizability uncertain. Paediatric cohorts exhibit highly variable cardiac morphology and function compared to adults, which may be useful for learning generalizable AI-ECG models. Methods: We pretrained ECG-Fyler on a predominantly paediatric, all-age cohort at Boston Children's Hospital (1992-2023), annotated with a cardiology-specific coding system (Fyler codes), and evaluated it on assessments from echocardiography (echo) and cardiac magnetic resonance (CMR) studies. We validated on an external adult cohort from Columbia University Irving Medical Center. Performance was benchmarked against several AI-ECG foundation models by AUROC across age groups, lesion types, and limited-data scenarios. Findings: The pretraining cohort comprised 782,138 ECGs from 255,271 patients (median age: 10.9 years, IQR: [2.8-16.8]). Internal evaluation included 178,495 ECG-echo pairs (median age: 10.9 [3.7-17.0]) and 8,584 ECG-CMR pairs (median age: 20.7 [15.6-29.6]). External validation included 82,543 ECG-echo pairs from adults (median age: 64.0 [52.0-74.0]). ECG-Fyler improved AUROC across biventricular dysfunction and dilation tasks, with the largest gains in low-data settings. In internal validation, ECG-Fyler detected low left ventricular ejection fraction (LVEF ≤ 40%) from only 100 fine-tuning samples (AUROC: 0.80, 95% CI: [0.78-0.80]), outperforming other models (AUROC < 0.65) and improving with additional fine-tuning (AUROC: 0.94 [0.93-0.94]). Similar improvements were observed for CMR-derived LVEF, RVEF, and ventricular dilation. In external validation on adults, ECG-Fyler exhibited an AUROC of 0.83 (CI: [0.82-0.85]) for LVEF ≤ 40%. After fine-tuning on less than 10% of external data, LVEF ≤ 45% performance (AUROC: 0.87 [0.86-0.88]) outperformed a fully trained, site-specific prior model (AUROC: 0.85 [0.84-0.87]). Interpretation: Pretraining on richly annotated, paediatric-dominant ECGs yields models that transfer efficiently across institutions and ages, supporting AI-ECG screening and triage when labels or imaging access are limited. Funding: National Institutes of Health (R01LM012973); Kostin Innovation Fund, Boston Children's Hospital
Tuttle, M.; Maas, C. C. H. M.; An, J.; Wessler, B. S.; Harvey, W. F.; Selker, H. P.; van Klaveren, D.; Kent, D. M.
The Epic Sepsis Model version 2 (ESMv2) is a prediction model embedded into the electronic medical record to warn clinicians which hospitalized patients are at risk for sepsis. We conducted a retrospective cohort study of 31,951 hospitalizations of 25,760 patients to compare analyses conducted at the commonly used patient level (where the maximum prediction prior to the onset of sepsis is used to measure performance) with analyses at a novel prediction level (where each prediction is used to measure performance). Sepsis, defined by the Sepsis-3 criteria, occurred during 1,049 hospitalizations (3.3%). Patient-level analyses suggested excellent discrimination (AUC 0.86; IQR 0.85, 0.87), whereas prediction-level analyses demonstrated lower performance (AUC 0.62; IQR 0.57, 0.65). Low estimates of the positive predictive value (14.5% at the patient level vs 4% at the prediction level) imply a high number of false alerts. Common evaluation approaches may overstate the performance of dynamic prediction models and mislead clinical decision-making.
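The difference between the two evaluation levels can be made concrete with a toy example. The sketch below is illustrative only: the data are invented, the helper names (`auc`, `patient_level_auc`, `prediction_level_auc`) are hypothetical, and it assumes that every prediction inherits the outcome label of its hospitalization, which may differ from how the study labeled individual predictions.

```python
def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def patient_level_auc(stays):
    """Score each hospitalization by its maximum prediction."""
    scores = [max(preds) for preds, _ in stays]
    labels = [label for _, label in stays]
    return auc(scores, labels)


def prediction_level_auc(stays):
    """Score every individual prediction (assumed to inherit the stay's label)."""
    scores = [s for preds, _ in stays for s in preds]
    labels = [label for preds, label in stays for _ in preds]
    return auc(scores, labels)


# Toy hospitalizations: (sequence of risk scores, sepsis label).
toy_stays = [
    ([0.3, 0.4, 0.9], 1),
    ([0.2, 0.5, 0.8], 1),
    ([0.3, 0.4, 0.5], 0),
    ([0.2, 0.3, 0.6], 0),
]
print("patient-level AUC:   ", patient_level_auc(toy_stays))
print("prediction-level AUC:", prediction_level_auc(toy_stays))
```

Taking the maximum keeps only each septic patient's single highest score, so the many low early scores that an alerting clinician would actually see drop out of the patient-level metric but remain in the prediction-level one.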
Hoang, N.; Yang, H.; Uddin, M. N.; Zhong, J.; Faiyaz, A.; Singh, M. V.; Boodoo, Z. D.; Sutton, K. R.; Wang, H. Z.; Sahin, B.; Khan, M. W.; Weber, M. T.; Yuan, C.; Chen, L.; Schifitto, G.
Background: Despite the success of combination antiretroviral therapy (cART), vascular comorbidities, including cerebrovascular disease, are more prominent in people living with HIV (PLWH) compared to people without HIV (PWOH). However, quantitative assessments of cerebrovascular morphometry and their associations with cognitive outcomes in the context of HIV are still limited. In this study, we explore this missing link. Methods: Magnetic Resonance Angiography (MRA) data, blood markers, and neurocognitive assessments were collected from 73 PWOH subjects (male: 57, female: 16; age: 53 ± 16) and 99 PLWH subjects (male: 66, female: 30; age: 53 ± 11). Vessel morphometric features were quantified using intraCranial Artery Feature Extraction (iCafe) to investigate associations between vessel morphometry, markers of monocytes, endothelial cell activation, and cognitive performance. Results: HIV status predicted a lower total number of branches (β = -0.224, p = 0.001, d = -0.517) and shorter total distal length (β = -0.173, p = 0.021, d = -0.370) with a moderate effect size. Total branch number was found to be negatively associated with plasma levels of monocyte markers (sCD14: r = -0.167, p = 0.033; sCD163: r = -0.157, p = 0.045) and positively correlated with white matter cerebral blood flow (r = 0.550; p ≤ 0.05). HIV status was the strongest predictor of overall cognitive performance in the ANCOVA model (β = -0.219, p = 0.006, d = -0.453). Conclusions: Our results suggest that cognitive impairment in PLWH is associated with vessel morphology metrics. Monocyte immune activation may contribute to changes in vessel morphology.
Reteig, L. C.; Woloshin, S.; Maglione, P. J.; Farmer, J. R.; Ong, M.-S.
Patients with primary immunodeficiency (PID) often face prolonged diagnostic delays and may increasingly turn to large language models (LLMs) to interpret their symptoms during this period. We evaluated whether an LLM could recognize PID from symptom descriptions derived from interviews with 21 PID patients. In a prior study, we showed that GPT-4o identified PID in 96% of cases when prompted with physician-written patient histories (Rider et al., JACI, 2024). Here, when prompted with symptom descriptions in patients' own words, GPT-5 identified PID in only 7 cases (33%), although it more broadly suggested immune system issues in 18 cases (81%). The gap between these findings indicates that LLMs are sensitive to the language and framing of symptom descriptions, performing substantially worse when patients describe their own symptoms in everyday language than when clinicians summarize patient histories in structured medical terms. This study underscores the need to carefully evaluate how LLMs are used in patient-facing applications.
Deng, Z.; Wang, Y.; Shi, Y.; Wang, L.; Qureshi, T. A.; Gaddam, S.; Javed, S.; Hsu, Y.-C.; De Righi, D. R.; Azab, L.; Diwan, G.; Yang, J. D.; Xie, Y.; Yuan, C.; Vendrami, C. L.; Rodriguez, A.; Specht, K.; Jeon, C. Y.; Chaudhry, H.; Buxbaum, J.; Pisegna, J. R.; Yaghmai, V.; Goessling, W.; Hernandez-Barco, Y. G.; Miller, F. H.; Tirkes, T.; Espinoza, S.; Musi, N.; Dey, D.; Sung, K. H.; Pandol, S. J.; Li, D.
Biological aging is heterogeneous across organ systems, yet whether CT-derived abdominal aging provides prognostic value beyond routine clinical data, and whether organ decomposition adds value beyond a unified estimate, remains untested. We developed organ-specific and ensemble biological age models from radiomic features across five abdominal organs in 68,675 CT scans from 32,883 subjects, and evaluated them on alignment with the chronological age of healthy subjects (nested cross-validation: MAE=3.68 years, R^2=0.90). In sequential analyses restricted to adults aged 20-60 years, the stratum with the strongest association between biological age gap (BAG) and disease, ensemble biological age gaps provided incremental prognostic value beyond demographic covariates for all-cause disease and mortality (Delta C-index=0.141, 0.051) and beyond routine blood biomarkers (Delta C-index=0.048), confirming that CT-derived aging captures structural information beyond laboratory markers. Organ-specific biological age added incremental prognostic value beyond the ensemble selectively for focal diseases: cardiovascular (aorta, Delta C-index=0.091) and hepato-pancreatic (pancreas, Delta C-index=0.096). These findings establish a hierarchical organization of CT-derived biological aging, positioning routine CT as a source that adds prognostic value to existing clinical biomarkers.
Haynes, A.; Mynard, J. P.; van der Veen, M.; Carson, J.; Green, D. J.
Intro: Characteristics of the pulse wave transmitted through the carotid arteries are predictive of cognitive decline and cerebrovascular health in humans. This study aimed to identify risk factor trajectories in childhood, adolescence and early adulthood that are associated with forward compression wave intensity (FCWI) in the common carotid artery in adults aged 28 years. Methods: Systolic blood pressure (SBP), body mass index (BMI) and fasting blood glucose (FBG), measured at multiple time-points when participants were aged 8-20 years, were included in a trajectory analysis. At age 28 years, FCWI was measured in 402 (M=206, F=196) participants who underwent a Duplex ultrasound assessment of the common carotid artery. Statistical analysis assessed differences in FCWI between trajectory groups for males and females separately. Results: In males, four trajectory groups were identified for BMI, three for SBP, and two for FBG. In females, three trajectory groups were identified for each of BMI, SBP, and FBG. In males, having higher BMI (P=0.006), SBP (P=0.021) and FBG (P=0.002) from ages 8-20 years was associated with greater FCWI at age 28 years. In females, no associations were found between FCWI at age 28 years and trajectory groups for BMI (P=0.185), SBP (P=0.289) or FBG (P=0.070). Conclusion: Having high BMI, SBP and FBG throughout childhood, adolescence and early adulthood was associated with higher FCWI in the carotid artery at age 28 years in males, but not females. This may have a direct impact on the etiology of cognitive decline and cerebrovascular disease in later life.
Marshall, A. T.; Kan, E.; Adise, S.; König, M.; McConnell, R.; Martinez, M.; Midya, V.; Arora, M.; Sowell, E. R.
Lead is a toxic metal ubiquitous in our environment. While dramatic reductions in lead sources have paralleled equivalent decreases in lead-poisoning rates, chronic lead exposure remains a critical public health concern. Childhood lead exposure (even at its lowest levels) is linked to changes in cognitive development, but less is known about lead's effects on children's brain structure, especially as a result of in utero exposure. We measured prenatal and early-postnatal lead exposure in shed deciduous teeth of 448 9- and 10-year-old children (from 20 United States cities) and linked those lead levels to childhood brain structure, cognition/behavior, and neighborhood- and family-level socioeconomic characteristics. Here we show negative associations between tooth-lead levels and the thickness of the brain's cortex, particularly in regions linked to language processing. With increasing tooth-lead levels, children of lower-income (versus higher-income) families showed steeper declines in receptive vocabulary. Caregiver-reported behavioral problems exhibited similar associations. With in utero exposure linked to adverse neurodevelopmental outcomes (well before lead exposure and its risks are evaluated by healthcare professionals), prenatal screening of maternal lead levels/exposure, coupled with recommended strategies to reduce its placental transmission, may help reduce lead's effects on future generations.
Periwal, V.
Background: Conventional psychiatric screening instruments summarize symptoms within individual scales and prioritize cases with high single-instrument additive score severity. This design treats items as independent within instruments and ignores cross-instrument covariance structure, making it insensitive to respondents whose responses are distributed across multiple domains in unusual combinations that remain below threshold on every individual scale. Methods: We analyzed two cohorts spanning older and younger adults. Item prompts from depression, stress, anxiety, and sleep instruments were embedded into a shared semantic space using a pretrained sentence encoder. Principal component analysis of the item-prompt embeddings alone (with no use of respondent data at this stage) was used to construct a low-dimensional subspace retaining 80% of variance in the item embedding matrix. Normalized participant responses were then projected into this subspace, with Jaccard-based stability analysis used as a check on dimensional robustness. Multivariate deviation from the cohort norm was quantified with Mahalanobis distance using Ledoit-Wolf covariance regularization. Candidate outliers were defined by the empirical 95th percentile of the cohort-specific distance distribution. To isolate response configurations not already captured by conventional single-instrument extreme-value logic, we excluded all outlier respondents who had endorsed any individual item at the maximum value of its Likert scale on any instrument. For the remaining outliers, anomalous components were backtracked to their original item loadings for interpretation. Results: In the older-adult Health and Retirement Study (HRS) cohort, principal component analysis of 27 item-prompt embeddings showed that a 10-dimensional subspace provided a stable representation of cross-instrument semantic structure. In the younger-adult Xinxiang cohort, the corresponding stable solution was 16-dimensional.
In each cohort, seven respondents remained as multivariate outliers despite falling below every single-instrument extreme-value threshold. These cases were not characterized by uniformly severe symptom scores but by unusual cross-domain response configurations that became visible only in the shared semantic covariance subspace. The response structure of the retained configurations differed across cohorts: older-adult cases more often involved weak endorsement of mood-labeled items alongside nonzero body- and sleep-related responses, whereas younger-adult cases more often involved incomplete response configurations spanning mood, sleep, stress, and self-harm-related items. Conclusions: A semantically aligned, auditable covariance subspace provides a practical tool for flagging unusual multivariate response configurations that single-instrument additive screening may not flag. The method is interpretable at the level of original item contributions. It should be understood as a hypothesis-generating screen for unusual response configurations requiring further clinical assessment, not as a diagnostic instrument. Outcome validity remains to be established by prospective study.
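The outlier-flagging steps described in the abstract above (regularized Mahalanobis distance, a 95th-percentile cutoff, then exclusion of respondents who endorsed any item at the scale maximum) can be illustrated with a minimal sketch. This is not the author's code. The function names and toy data are hypothetical, the embedding and PCA projection steps are omitted, and a fixed shrinkage coefficient stands in for the data-driven coefficient that Ledoit-Wolf estimation would choose.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length response vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def shrunk_covariance(rows, shrinkage=0.1):
    """Sample covariance shrunk toward a scaled identity matrix.

    ASSUMPTION: `shrinkage` is fixed here for illustration; Ledoit-Wolf
    regularization would estimate it from the data instead.
    """
    n, p = len(rows), len(rows[0])
    mu = mean_vector(rows)
    centered = [[x - m for x, m in zip(r, mu)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(p)]
           for i in range(p)]
    target = sum(cov[i][i] for i in range(p)) / p  # average variance
    return [[(1 - shrinkage) * cov[i][j] + (shrinkage * target if i == j else 0.0)
             for j in range(p)] for i in range(p)]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

def mahalanobis_distances(rows, shrinkage=0.1):
    """Distance of each respondent from the cohort mean under the shrunk covariance."""
    mu = mean_vector(rows)
    cov = shrunk_covariance(rows, shrinkage)
    distances = []
    for r in rows:
        d = [x - m for x, m in zip(r, mu)]
        distances.append(math.sqrt(sum(di * si for di, si in zip(d, solve(cov, d)))))
    return distances

def flag_outliers(distances, percent=95):
    """Indices whose distance exceeds the empirical (nearest-rank) percentile."""
    ranked = sorted(distances)
    k = max(0, -(-percent * len(ranked) // 100) - 1)  # integer ceiling, 0-based
    cutoff = ranked[k]
    return [i for i, d in enumerate(distances) if d > cutoff]

def drop_max_endorsers(flagged, raw_rows, scale_max):
    """Keep only flagged respondents with no item at the scale maximum."""
    return [i for i in flagged if all(v < scale_max for v in raw_rows[i])]

if __name__ == "__main__":
    # Toy cohort: two items on a 0-4 scale that usually move together,
    # plus one respondent (the last) with an unusual, non-extreme combination.
    rows = [[i % 5, i % 5] for i in range(19)] + [[0, 3]]
    flagged = flag_outliers(mahalanobis_distances(rows))
    print(drop_max_endorsers(flagged, rows, scale_max=4))
```

In the toy cohort the last respondent never reaches the maximum on either item, so a single-item extreme-value rule would not single them out, while the covariance-aware distance does. The shrinkage term also keeps the covariance matrix invertible when items are nearly collinear.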